San Diego Supercomputer Center Houses the National Metabolomics Data Repository

NIH-Funded Project Advances the Study of the Body’s Metabolic Systems

Published September 14, 2021

Founded on the principles of FAIR (findable, accessible, interoperable, and reusable) data, the National Metabolomics Data Repository is housed at the San Diego Supercomputer Center and has implemented a robust set of studies that provide metadata on a range of worldwide research including 196 cases involving cancer and 66 regarding diabetes. Credit: National Metabolomics Data Repository

Have you ever wondered how the “normal” resting heart rate for your age group, used by your doctor, was determined or how the American Diabetes Association established the “normal” fasting glucose value? Understanding these “normal” ranges for metabolism is complex, especially because the human body may contain tens of thousands of metabolites at any one time; each individual molecule could be tied to a variety of processes that together make up the human metabolism.

Thanks to funding from the National Institutes of Health (NIH), a group of scientists – including San Diego Supercomputer Center (SDSC) Bioinformatics Researcher Eoin Fahy and SDSC Distinguished Scientist Shankar Subramaniam – has recently made a significant leap in ensuring that “normal” numbers are better understood for individuals, rather than for the human population as a whole. Fahy and Subramaniam have been developing and hosting the National Metabolomics Data Repository (NMDR) for the past several years.

Rather than remain content to host data for others’ uses, the NMDR team realized they were in a unique situation – with access to hundreds of metabolomics studies in the repository – and developed a tool, MetStat, that provides summary information for “normal” ranges across more than 1,800 metabolomics studies. Instead of examining the “normal” ranges for an overall population, MetStat allows users to look at data regarding specific metabolites within large groups of datasets and to see what is “typical.”

For instance, one of the studies in the repository is a diabetes study containing data from 12,000 patient samples collected worldwide. The metadata includes metabolite measurements that can help practitioners better understand individual “typical” ranges for those suffering from diabetes.

“Metabolite markers in blood is our first window into human physiology,” said Subramaniam, NMDR principal investigator and distinguished professor with the UC San Diego Bioengineering Department. “Our project takes this information and creates the ability to study human metabolomics.”

Not only does the repository allow users to access the metadata, it also offers multiple user-friendly tools that let researchers compare and analyze the repository data alongside their own. The project follows the FAIR (findable, accessible, interoperable, reusable) principles, and Fahy further explained how scientists are using the repository.

“Having a large, open-source repository of studies covering a broad range of species, tissue sources, disease associations and analytical instrumentation provides a valuable resource for metabolomics researchers around the world,” said Fahy. “Free access to the underlying raw data and experimental conditions also supports re-analysis and comparative analysis across multiple studies and other ‘omics’ platforms which is likely to lead to key insights in systems biology in the future.”

The NMDR is funded by NIH grant number U2C-DK119886.

About SDSC

The San Diego Supercomputer Center (SDSC) is a leader and pioneer in high-performance and data-intensive computing, providing cyberinfrastructure resources, services and expertise to the national research community, academia and industry. Located on the UC San Diego campus, SDSC supports hundreds of multidisciplinary programs spanning a wide variety of domains, from astrophysics and earth sciences to disease research and drug discovery. SDSC's newest National Science Foundation-funded supercomputer, Expanse, supports SDSC's theme of "Computing without Boundaries" with a data-centric architecture, commercial cloud integration and state-of-the-art GPUs for incorporating experimental facilities and edge computing.